This analysis gives you a summary-level overview of the U.S. Craft Beers and Breweries datasets, which were supplied to us by the CFO and CEO of Budweiser. The overview covers the number of breweries per state, summary statistics on alcohol by volume (ABV) and International Bitterness Units (IBU), and the correlation between bitterness and alcohol content. After the presentation, you will walk away with a better understanding of the areas to focus on and the types of beers that consumers favor.
R-code Explanation:
- Read the datasets into variables
- Count breweries by state and sort in descending order
- Mutate the table to create the bar plot table
- Create a new column to translate fully spelled-out state names into abbreviations
- Merge the brewery count by state with the map_data table from R. This is needed to create the heat map.
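The steps above can be sketched in base R as follows. This is a minimal illustration, not the report's actual code: the toy `breweries` table, its column names, and the `map_stub` stand-in for `ggplot2::map_data("state")` are assumptions, and the report itself may use dplyr verbs instead.

```r
# Toy stand-in for the breweries table (hypothetical column names).
breweries <- data.frame(
  Brew_ID = 1:6,
  State   = c("CO", "CA", "CO", "MN", "CO", "CA"),
  stringsAsFactors = FALSE
)

# Count breweries per state, then sort in descending order of count.
counts <- as.data.frame(table(State = breweries$State),
                        responseName = "CountofBreweries",
                        stringsAsFactors = FALSE)
counts <- counts[order(-counts$CountofBreweries, counts$State), ]
rownames(counts) <- NULL

# Stub for ggplot2::map_data("state"), which returns polygon rows with a
# lowercase full-name `region` column; only that column is kept here.
map_stub <- data.frame(
  region = c("colorado", "colorado", "california", "minnesota", "texas"),
  stringsAsFactors = FALSE
)

# Translate full state names into abbreviations using R's built-in
# state.name / state.abb vectors (they cover the 50 states, not DC).
lookup <- setNames(state.abb, tolower(state.name))
map_stub$State <- unname(lookup[map_stub$region])

# Left-join the counts onto the map rows; states with no breweries get NA.
heat <- merge(map_stub, counts, by = "State", all.x = TRUE)
print(counts)
print(heat)
```

Because the built-in lookup has no entry for Washington D.C., a real run would need to handle that row separately.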
Analysis Explanation:
The heat map and bar plot show that Colorado has 47 breweries, the most in the US, while Washington D.C., North Dakota, South Dakota, and West Virginia each have only 1 brewery.
| State | CountofBreweries |
|---|---|
| CO | 47 |
| CA | 39 |
| MI | 32 |
| OR | 29 |
| TX | 28 |
| PA | 25 |
| MA | 23 |
| WA | 23 |
| IN | 22 |
| WI | 20 |
| NC | 19 |
| IL | 18 |
| NY | 16 |
| VA | 16 |
| FL | 15 |
| OH | 15 |
| MN | 12 |
| AZ | 11 |
| VT | 10 |
| ME | 9 |
| MO | 9 |
| MT | 9 |
| CT | 8 |
| AK | 7 |
| GA | 7 |
| MD | 7 |
| OK | 6 |
| IA | 5 |
| ID | 5 |
| LA | 5 |
| NE | 5 |
| RI | 5 |
| HI | 4 |
| KY | 4 |
| NM | 4 |
| SC | 4 |
| UT | 4 |
| WY | 4 |
| AL | 3 |
| KS | 3 |
| NH | 3 |
| NJ | 3 |
| TN | 3 |
| AR | 2 |
| DE | 2 |
| MS | 2 |
| NV | 2 |
| DC | 1 |
| ND | 1 |
| SD | 1 |
| WV | 1 |
R-code Explanation:
- Read in the beer dataset
- Merge the beer data with the brewery data
- Create a kable table to output the first and last 6 observations
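A minimal sketch of the merge and the first/last-rows check, using a few rows taken from the tables in this section. The key column name `Brewery_id` on the beer side is an assumption, as is the split of columns between the two input tables.

```r
# A few breweries and beers copied from the tables in this report.
breweries <- data.frame(
  Brew_ID      = c(1, 556, 558),
  Brewery.Name = c("NorthGate Brewing", "Ukiah Brewing Company",
                   "Sleeping Lady Brewing Company"),
  City         = c("Minneapolis", "Ukiah", "Anchorage"),
  State        = c("MN", "CA", "AK"),
  stringsAsFactors = FALSE
)
beers <- data.frame(
  Beer.Name  = c("Pumpion", "Stronghold", "Pilsner Ukiah",
                 "Urban Wilderness Pale Ale"),
  Beer_ID    = c(2689, 2688, 98, 30),
  ABV        = c(0.060, 0.060, 0.055, 0.049),
  IBU        = c(38, 25, NA, NA),
  Style      = c("Pumpkin Ale", "American Porter", "German Pilsener",
                 "English Pale Ale"),
  Ounces     = c(16, 16, 12, 12),
  Brewery_id = c(1, 1, 556, 558),  # assumed name of the foreign key
  stringsAsFactors = FALSE
)

# Join each beer to its brewery on the brewery identifier.
merged <- merge(breweries, beers, by.x = "Brew_ID", by.y = "Brewery_id")

# QA check: every beer should have found a brewery.
stopifnot(nrow(merged) == nrow(beers))

# First and last observations; in a knitted report these calls could be
# wrapped in knitr::kable() to render them as tables.
print(head(merged, 6))
print(tail(merged, 6))
```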
Analysis Explanation:
To analyze the data more fully, we have to merge the U.S. Craft Beers dataset with the Breweries dataset. This allows us to see the breweries in each state, as well as the beers each brewery produces and their style, alcohol content, and IBU. Tables 2 and 3 are output as a QA check on the merge. As you can see, each row lists a brewery and its state along with the details of a beer it produces.
| Brew_ID | Brewery.Name | City | State | Beer.Name | Beer_ID | ABV | IBU | Style | Ounces |
|---|---|---|---|---|---|---|---|---|---|
| 1 | NorthGate Brewing | Minneapolis | MN | Pumpion | 2689 | 0.060 | 38 | Pumpkin Ale | 16 |
| 1 | NorthGate Brewing | Minneapolis | MN | Stronghold | 2688 | 0.060 | 25 | American Porter | 16 |
| 1 | NorthGate Brewing | Minneapolis | MN | Parapet ESB | 2687 | 0.056 | 47 | Extra Special / Strong Bitter (ESB) | 16 |
| 1 | NorthGate Brewing | Minneapolis | MN | Get Together | 2692 | 0.045 | 50 | American IPA | 16 |
| 1 | NorthGate Brewing | Minneapolis | MN | Maggie’s Leap | 2691 | 0.049 | 26 | Milk / Sweet Stout | 16 |
| 1 | NorthGate Brewing | Minneapolis | MN | Wall’s End | 2690 | 0.048 | 19 | English Brown Ale | 16 |
| Brew_ID | Brewery.Name | City | State | Beer.Name | Beer_ID | ABV | IBU | Style | Ounces |
|---|---|---|---|---|---|---|---|---|---|
| 556 | Ukiah Brewing Company | Ukiah | CA | Pilsner Ukiah | 98 | 0.055 | NA | German Pilsener | 12 |
| 557 | Butternuts Beer and Ale | Garrattsville | NY | Porkslap Pale Ale | 49 | 0.043 | NA | American Pale Ale (APA) | 12 |
| 557 | Butternuts Beer and Ale | Garrattsville | NY | Snapperhead IPA | 51 | 0.068 | NA | American IPA | 12 |
| 557 | Butternuts Beer and Ale | Garrattsville | NY | Moo Thunder Stout | 50 | 0.049 | NA | Milk / Sweet Stout | 12 |
| 557 | Butternuts Beer and Ale | Garrattsville | NY | Heinnieweisse Weissebier | 52 | 0.049 | NA | Hefeweizen | 12 |
| 558 | Sleeping Lady Brewing Company | Anchorage | AK | Urban Wilderness Pale Ale | 30 | 0.049 | NA | English Pale Ale | 12 |
R-code Explanation:
- Calculating the number of NA’s in each column
- Creating a bar chart of the NA count for each column
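One way to count the missing values per column, shown on a small made-up data frame (the values are placeholders, not results from the dataset):

```r
# Placeholder data with some missing values.
beers <- data.frame(
  ABV   = c(0.060, 0.055, NA, 0.049),
  IBU   = c(38, NA, NA, 19),
  Style = c("Pumpkin Ale", "German Pilsener", "American IPA", "Hefeweizen"),
  stringsAsFactors = FALSE
)

# is.na() gives a logical matrix; column sums are the NA counts.
na_counts <- colSums(is.na(beers))

# Keep only the columns that actually have missing values.
na_counts <- na_counts[na_counts > 0]
print(na_counts)
```

Passing `na_counts` to `barplot()` would then draw a static version of the chart; the report's own chart may be built differently.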
Analysis Explanation:
Further analysis showed that there are missing values in the U.S. craft beer dataset. The chart below shows the count of missing values in each relevant column. The analysis uses only the values on hand.
R-code Explanation:
- Calculate overall summary statistics on the original dataset: mean, median, max, min, and 75th percentile
- Create an interactive line graph
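The summary statistics listed above could be computed with a small helper like the one below. The function name `summarise_col` is hypothetical, and the ABV values are the six NorthGate Brewing rows from the merged table plus one `NA` to show how missing values are skipped.

```r
# Hypothetical helper: the five statistics named above, ignoring NAs.
summarise_col <- function(x) {
  c(
    mean   = mean(x, na.rm = TRUE),
    median = median(x, na.rm = TRUE),
    max    = max(x, na.rm = TRUE),
    min    = min(x, na.rm = TRUE),
    q75    = unname(quantile(x, 0.75, na.rm = TRUE))
  )
}

abv <- c(0.060, 0.060, 0.056, 0.045, 0.049, 0.048, NA)
abv_stats <- summarise_col(abv)
print(abv_stats)
```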
Analysis Explanation:
This is important to understand because if we look only at the maximum ABV and IBU levels, the perception is that Colorado only produces beers with high alcohol content or that Oregon only produces beers that are really bitter. Combining the bar graph with Figure 3, you start to see that Colorado doesn’t exclusively produce beers with high alcohol content; instead, it makes a couple of beers with high ABV. That’s why the mean ABV is pulled toward higher levels: the mean is not resistant to outliers.
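A small illustration of why the mean is not resistant to outliers. The values are made up for the example: four typical beers and one very strong one.

```r
# Illustrative ABV values: four ordinary beers and one strong outlier.
abv <- c(0.05, 0.05, 0.05, 0.05, 0.125)

mean_abv   <- mean(abv)    # pulled upward by the single strong beer
median_abv <- median(abv)  # stays at the typical value

cat("mean:", mean_abv, " median:", median_abv, "\n")
```

A single high value moves the mean from 0.05 to 0.065, while the median does not move at all.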
R-code Explanation:
- Calculating the median ABV and IBU for each state and ignoring the NA’s
- Creating an interactive bar chart of the aforementioned calculation for each state
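A minimal base R sketch of the per-state medians, using a few rows from the merged table. The data frame interface of `aggregate()` is used here rather than the formula interface, because the formula interface would drop any row where either ABV or IBU is missing.

```r
# Rows taken from the merged table: three MN beers and three NY beers.
beers <- data.frame(
  State = c("MN", "MN", "MN", "NY", "NY", "NY"),
  ABV   = c(0.060, 0.060, 0.056, 0.043, 0.068, 0.049),
  IBU   = c(38, 25, 47, NA, NA, NA),
  stringsAsFactors = FALSE
)

# Median of each measure per state, skipping NAs column by column.
# A state whose IBU values are all missing gets NA for its median IBU.
medians <- aggregate(beers[c("ABV", "IBU")],
                     by = list(State = beers$State),
                     FUN = median, na.rm = TRUE)
print(medians)
```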
Analysis Explanation:
It’s important to look at the maximums, means, and medians of the data so that you can understand what consumers want in each state. The graph below shows the median alcohol and bitterness content by state. Here you start to see the different combinations of ABV and IBU results. For example, Washington D.C. has the highest median ABV, but Maine has the highest median IBU. Is there a correlation?
R-code Explanation:
- Create an interactive scatterplot with the ABV and IBU data
Analysis Explanation:
Figure 8 shows the correlation between ABV and IBU. As the alcohol content gets higher, the IBU level tends to go up. There are some outliers: the beer with the highest ABV, at .125, does not have the highest IBU. What matters here is where the clustering happens. That is your indication of a beer that may appeal to the majority of consumers.
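The relationship in the scatterplot can also be summarized numerically with a correlation coefficient. The sketch below only shows the mechanics on seven rows from the merged table; a sample this small says nothing about the full dataset, so no result is quoted.

```r
# Seven rows from the merged table; the last one has no IBU value.
beers <- data.frame(
  ABV = c(0.060, 0.060, 0.056, 0.045, 0.049, 0.048, 0.055),
  IBU = c(38, 25, 47, 50, 26, 19, NA)
)

# Rows where both measures are present.
n_complete <- sum(complete.cases(beers))

# Pearson correlation, computed only on the complete rows.
r <- cor(beers$ABV, beers$IBU, use = "complete.obs")

cat("rows used:", n_complete, " correlation:", round(r, 3), "\n")
```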
The best action to take when trying to compete with other breweries is to understand how many different beers and breweries there are in each state. Figure 9 gives you that view. Target the states with low brewery and beer-type counts. Also, look at the clustering in Figure 8 to produce the most favorable beer. For instance, most of the clustering happens around ABV levels of .04 to .06 and IBU levels of 20 to 40. This could mean that such beer is in high demand, or that breweries produce it because it’s cheap to make. Further analysis with additional data points, such as taste type, would need to be done.
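To put a number on the clustering, one could compute the share of beers that fall inside the ABV .04 to .06 and IBU 20 to 40 window. The sketch below does this for six rows from the merged table; the column names are assumptions and the resulting share describes only this toy sample.

```r
# Six rows from the merged table (NorthGate Brewing beers).
beers <- data.frame(
  ABV = c(0.060, 0.060, 0.056, 0.045, 0.049, 0.048),
  IBU = c(38, 25, 47, 50, 26, 19)
)

# Flag beers inside the window described above (bounds inclusive).
in_cluster <- beers$ABV >= 0.04 & beers$ABV <= 0.06 &
              beers$IBU >= 20   & beers$IBU <= 40

# Share of beers with both measures present that fall in the window.
cluster_share <- mean(in_cluster, na.rm = TRUE)
cat("beers in window:", sum(in_cluster, na.rm = TRUE),
    " share:", cluster_share, "\n")
```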